loki.source.file: refactor tail package #5003
Conversation
kalleep left a comment
Not sure what I should put in the changelog, any suggestions?
Pull request overview
This PR refactors the internal tail package to simplify its API and reduce resource usage. The refactoring eliminates 4 goroutines per tailed file by replacing the channel-based design with direct method calls (Next() and Stop()), consolidating position updates into the main read loop, and removing the loki.EntryHandler wrapper. The new design reduces memory overhead and complexity while maintaining all existing functionality.
Key Changes:
- Simplified the tail package API from channel-based to blocking method calls with Next() and Stop() methods (a minimal sketch of this API is shown after this list)
- Added an Offset field to the Line struct to enable inline position tracking without a separate goroutine
- Consolidated file watching logic into standalone blocking functions instead of background goroutines
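A minimal sketch, based only on the names mentioned above (File, Line, Next, Stop, Offset), of what the new API surface could look like; the real package may differ in signatures and error handling:

```go
package tail

// Line is a single line read from the file. Offset is the byte offset into
// the file immediately after this line, so callers can persist their read
// position inline instead of in a separate goroutine.
type Line struct {
	Text   string
	Offset int64
}

// File tails a single file. Internal fields (file handle, watcher, cancel
// function) are omitted in this sketch.
type File struct{}

// Next returns the next line of the file, blocking until new data arrives,
// a relevant file event occurs, or Stop is called.
func (f *File) Next() (Line, error) { panic("sketch only") }

// Stop cancels any blocked calls to Next and closes the underlying file.
func (f *File) Stop() error { panic("sketch only") }
```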
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| internal/component/loki/source/file/tailer.go | Refactored to use new tail API, removed separate position update goroutine, merged label handling into entry creation |
| internal/component/loki/source/file/tailer_test.go | Updated test to accommodate new tailer startup behavior |
| internal/component/loki/source/file/internal/tail/file.go | New implementation of file tailing with blocking Next() method |
| internal/component/loki/source/file/internal/tail/file_test.go | Comprehensive test suite for new File implementation |
| internal/component/loki/source/file/internal/tail/line.go | Added Line struct with Offset field for position tracking |
| internal/component/loki/source/file/internal/tail/config.go | New configuration structures for File and WatcherConfig |
| internal/component/loki/source/file/internal/tail/block.go | Standalone blocking functions for file existence and event detection |
| internal/component/loki/source/file/internal/tail/fileext/file_*.go | Consolidated and renamed from winfile package, includes platform-specific file operations |
| internal/component/loki/source/file/internal/tail/watch/*.go | Removed old watch package with channel-based file watching |
| internal/component/loki/source/file/internal/tail/tail.go | Removed old channel-based tail implementation |
| internal/component/loki/source/file/internal/tail/util/util.go | Removed unused utility package |
| internal/component/loki/source/file/internal/tail/README.md | Removed outdated README referencing old implementation |
Comments suppressed due to low confidence (1)
internal/component/loki/source/file/internal/tail/fileext/file_windows.go:104
- Typo in comment: "ended information" should be "extended information" to match the description above.
MinBackoff: defaultWatcherConfig.MaxPollFrequency,
MaxBackoff: defaultWatcherConfig.MaxPollFrequency,
Copilot AI (Dec 4, 2025)
The backoff configuration uses defaultWatcherConfig.MaxPollFrequency instead of f.cfg.WatcherConfig.MaxPollFrequency. This means that even if the user configured custom polling frequencies, the reopen logic will always use the default values (250ms). This is inconsistent with the rest of the file watching behavior and could lead to unexpected delays or polling frequencies during file reopens. Consider using f.cfg.WatcherConfig.MaxPollFrequency instead.
Suggested change:
- MinBackoff: defaultWatcherConfig.MaxPollFrequency,
- MaxBackoff: defaultWatcherConfig.MaxPollFrequency,
+ MinBackoff: f.cfg.WatcherConfig.MaxPollFrequency,
+ MaxBackoff: f.cfg.WatcherConfig.MaxPollFrequency,
I'm confused; the AI's comment seems valid here. Can you leave a comment to explain?
So the original implementation had this too: https://github.com/grafana/alloy/blob/main/internal/component/loki/source/file/internal/tail/tail.go#L229
I just kept the same behavior, but we can for sure change this.
Force-pushed from 9d7217a to 1fa3763
Force-pushed from 4d413e9 to 1fa3763
line = strings.TrimRight(line, "\n")
// Trim Windows line endings
line = strings.TrimSuffix(line, "\r")
Why not just TrimRight with "\r\n"? TrimRight removes all of the trailing characters that are in the set, so a single call handles both endings. The current behavior only trims a single Windows line ending, while it trims all trailing newlines on non-Windows.
This is what was there before, but you are right, we can just use strings.TrimRight(line, "\r\n") 👍
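For illustration, a small standalone example of the difference discussed here (the input string is hypothetical, not taken from the PR):

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// A line that happens to end with two Windows line endings.
	s := "abc\r\n\r\n"

	// Current behavior: trim every trailing '\n', then a single trailing '\r'.
	current := strings.TrimSuffix(strings.TrimRight(s, "\n"), "\r")

	// Suggested: trim every trailing '\r' or '\n' in one call.
	suggested := strings.TrimRight(s, "\r\n")

	fmt.Printf("%q\n", current)   // "abc\r\n" (one Windows ending survives)
	fmt.Printf("%q\n", suggested) // "abc"
}
```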
Force-pushed from 27d7fc2 to 12dcebf
🔍 Dependency Review
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 -> removed — ✅ Safe
Summary:
Why this is safe:
Key migrations observed in this PR:
Relevant snippets (before vs after):
If you need to remove tomb usage elsewhere in your codebase, here’s a concise migration pattern:
Example migration diff:

- type Worker struct {
-     Tomb tomb.Tomb
- }
+ type Worker struct {
+     ctx    context.Context
+     cancel context.CancelFunc
+ }

- func (w *Worker) Start() {
-     w.Tomb.Go(func() error {
-         for {
-             select {
-             case <-w.Tomb.Dying():
-                 return nil
-             default:
-                 // work
-             }
-         }
-     })
- }
+ func (w *Worker) Start() {
+     w.ctx, w.cancel = context.WithCancel(context.Background())
+     go func() {
+         defer func() { /* cleanup */ }()
+         for {
+             select {
+             case <-w.ctx.Done():
+                 return
+             default:
+                 // work
+             }
+         }
+     }()
+ }

- func (w *Worker) Stop() error {
-     w.Tomb.Kill(nil)
-     return w.Tomb.Wait()
- }
+ func (w *Worker) Stop() error {
+     if w.cancel != nil {
+         w.cancel()
+     }
+     return nil
+ }

No changes to imports are necessary beyond removing the gopkg.in/tomb.v1 import.
The test that fails on Windows is not related to this PR and seems to be broken on main.
Force-pushed from 3c6c2a8 to 1b6fdf8
avoid running 1 extra goroutine per file
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Force-pushed from 1b6fdf8 to 9f08d05
ptodev left a comment
Do you think that with fewer goroutines there is a higher risk of the pipeline being blocked? Would it be more likely to experience slowness due to the larger number of synchronous calls? E.g.:
- If the reading is slow
- If the positions updating is slow (I know this is unlikely, but probably worth considering)
- If the sending is slow

PR Description
During the hackathon I decided to explore a new API for our internal / vendored tail package.
The package had questionable locking, and it is really hard to determine whether it was done correctly. The design also spawned 2 goroutines per file we tailed, and everything was communicated over a channel.
So the design I came up with is a much smaller API surface with only two public methods, Next and Stop. Next will return the next line of the file or block until either a file event occurs or the file is stopped. Stop will cancel any blocked calls to Next and close the file.
I also added an offset to Line; this is the offset into the file right after the line we just consumed. This allows us to remove an additional goroutine in tailer.go, the background job updating positions and metrics. We can instead do this during the read loop, still respecting the interval for updates. I ran two Alloy collectors with the same config, tailing 20 files.
This will drastically reduce the number of goroutines, and the memory overhead that comes with them, for Alloy instances tailing many files.
I also removed the usage of loki.EntryHandler and set labels directly on the entry we send; this removes one additional goroutine.
So in total it would be 4 fewer goroutines per file tailed.
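As a rough, self-contained sketch of the consolidated read loop described above (the Line, File, and Positions shapes and the send callback are simplified stand-ins, not the actual tailer.go types):

```go
package sketch

import "time"

// Line is the shape described in the PR: the text of a line plus the offset
// into the file right after it.
type Line struct {
	Text   string
	Offset int64
}

// File and Positions are simplified stand-ins for the real types.
type File interface {
	Next() (Line, error) // blocks until a line is available or Stop is called
	Stop() error
}

type Positions interface {
	Put(path, labels string, offset int64)
}

// readLoop shows the idea: because every Line carries its end offset, the
// position file can be updated inline, throttled by interval, instead of by
// a separate background goroutine.
func readLoop(f File, pos Positions, path, labels string, interval time.Duration, send func(Line)) {
	lastUpdate := time.Now()
	for {
		line, err := f.Next()
		if err != nil {
			return // Stop was called or the file went away
		}
		send(line) // hand the line to the pipeline, labels attached upstream

		if time.Since(lastUpdate) >= interval {
			pos.Put(path, labels, line.Offset)
			lastUpdate = time.Now()
		}
	}
}
```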
Which issue(s) this PR fixes
Notes to the Reviewer
PR Checklist